Android malware family classification method based on code image integration
Mo LI, Tianliang LU, Ziheng XIE
Journal of Computer Applications 2022, 42 (5): 1490-1499. DOI: 10.11772/j.issn.1001-9081.2021030486

Code visualization technology has been rapidly adopted in Android malware research since it was proposed. To address the insufficient representation ability of a code image converted from a single DEX (classes.dex) file, a new Android malware family classification method based on code image integration was proposed. Firstly, the DEX, XML (AndroidManifest.xml) and decompiled JAR (classes.jar) files in the Android application package were converted into three gray-scale images, and the bilinear interpolation algorithm was used to scale gray-scale images of different sizes. Then, the three gray-scale images were integrated into a three-channel Red-Green-Blue (RGB) image for training and classification. For the classification model, Soft Threshold (ST) Block+ResNeSt (STResNeSt) was proposed by combining the soft threshold denoising block with the Split-Attention based ResNeSt. The proposed model has strong anti-noise ability and pays more attention to the important features of the code image. To handle the long-tail distribution of data during training, Class Balance Loss (CB Loss) was introduced after data augmentation, which provided a feasible solution to the over-fitting caused by sample imbalance. On the Drebin dataset, the accuracy of the integrated code image is 2.93 percentage points higher than that of the DEX gray-scale image, the accuracy of STResNeSt is 1.1 percentage points higher than that of the Residual Neural Network (ResNet), and the scheme of data augmentation combined with CB Loss improves the F1 score by up to 2.4 percentage points. Experimental results show that the average classification accuracy of the proposed method reaches 98.97%, so the method can effectively classify Android malware families.
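The image-integration step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the fixed row width, the output size and the assignment of DEX, XML and JAR to the R, G and B channels are all assumptions made for the example, and the bilinear resize uses a simple align-corners mapping.

```python
import math

def bytes_to_gray(data: bytes, width: int) -> list[list[int]]:
    """Lay raw bytes out row by row as a gray-scale image (one byte = one pixel)."""
    height = max(1, math.ceil(len(data) / width))
    padded = data + bytes(height * width - len(data))  # zero-pad the last row
    return [list(padded[r * width:(r + 1) * width]) for r in range(height)]

def resize_bilinear(img: list[list[int]], out_h: int, out_w: int) -> list[list[int]]:
    """Scale a gray-scale image to out_h x out_w with bilinear interpolation."""
    in_h, in_w = len(img), len(img[0])
    out = []
    for i in range(out_h):
        # Map the output row onto a (fractional) source row.
        y = i * (in_h - 1) / (out_h - 1) if out_h > 1 else 0.0
        y0 = int(y)
        y1 = min(y0 + 1, in_h - 1)
        dy = y - y0
        row = []
        for j in range(out_w):
            x = j * (in_w - 1) / (out_w - 1) if out_w > 1 else 0.0
            x0 = int(x)
            x1 = min(x0 + 1, in_w - 1)
            dx = x - x0
            # Interpolate horizontally on the two neighbouring rows, then vertically.
            top = img[y0][x0] * (1 - dx) + img[y0][x1] * dx
            bottom = img[y1][x0] * (1 - dx) + img[y1][x1] * dx
            row.append(round(top * (1 - dy) + bottom * dy))
        out.append(row)
    return out

def integrate_rgb(dex: bytes, xml: bytes, jar: bytes, size: int = 64, width: int = 32):
    """Stack three equally sized gray-scale images into one RGB image.

    Assumption: DEX -> R, XML -> G, JAR -> B. The abstract does not state the order.
    """
    r, g, b = (resize_bilinear(bytes_to_gray(f, width), size, size) for f in (dex, xml, jar))
    return [[(r[i][j], g[i][j], b[i][j]) for j in range(size)] for i in range(size)]
```

Resizing all three images to a common size before stacking is what allows files of very different lengths to share one image tensor.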

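Two further ingredients named in the abstract, soft thresholding and class-balanced loss weighting, can be illustrated in a few lines. The sketch below assumes the standard soft-threshold operator and the commonly used "effective number of samples" form of class-balanced weights; the abstract does not give the exact formulas, and the function names and the default beta are illustrative only.

```python
def soft_threshold(x: float, tau: float) -> float:
    """Shrink x toward zero by tau; values with |x| <= tau become 0."""
    magnitude = max(abs(x) - tau, 0.0)
    return magnitude if x >= 0 else -magnitude

def class_balanced_weights(samples_per_class: list[int], beta: float = 0.999) -> list[float]:
    """Weight each class by (1 - beta) / (1 - beta ** n), n being its sample count.

    Weights are normalised so that they sum to the number of classes.
    Every class is assumed to have at least one sample.
    """
    raw = [(1.0 - beta) / (1.0 - beta ** n) for n in samples_per_class]
    scale = len(raw) / sum(raw)
    return [w * scale for w in raw]

# A rare family (10 samples) receives a larger weight than a common one (1000 samples).
weights = class_balanced_weights([1000, 10])
```

Multiplying each sample's loss by the weight of its class is one way such a scheme counteracts a long-tail distribution of malware families.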
Tor website traffic analysis model based on self-attention mechanism and spatiotemporal features
Rongkang XI, Manchun CAI, Tianliang LU, Yanlin LI
Journal of Computer Applications 2022, 42 (10): 3084-3090. DOI: 10.11772/j.issn.1001-9081.2021081452

The onion router (Tor) anonymous communication system is used by criminals to carry out illegal activities on the dark web, which poses severe challenges to social security. Tor website traffic analysis technology captures and analyzes Tor website traffic so that illegal behaviors hidden on the Internet can be discovered in time for network supervision. On this basis, a Tor website traffic analysis model based on Self-Attention and Hierarchical SpatioTemporal (SA-HST) features was proposed. Firstly, an attention mechanism was introduced to assign different weights to the network traffic features to highlight the important ones. Then, a Convolutional Neural Network (CNN) with a multi-channel parallel structure and a Long Short-Term Memory (LSTM) network were used to extract the spatiotemporal features of the input data. Finally, the Softmax function was used to classify the data. SA-HST achieves 97.14% accuracy in the closed-world scenario, which is 8.74 and 7.84 percentage points higher than the CUMUL (CUMULative sum fingerprinting) model and the deep learning model CNN, respectively. In the open-world scenario, the confusion-matrix-based evaluation indicators of SA-HST remain stably above 96%. Experimental results show that the self-attention mechanism can achieve efficient feature extraction under a lightweight model structure. By capturing important, multi-view spatiotemporal features of anonymous traffic for classification, SA-HST has certain advantages in classification accuracy, training efficiency and robustness.
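The attention weighting and the final Softmax step mentioned in the abstract can be sketched as below. This is only a toy illustration under stated assumptions: it uses scaled dot-product self-attention with identity query, key and value projections (a real model would learn these projections), and it omits the CNN and LSTM stages entirely. The function names are hypothetical.

```python
import math

def softmax(scores: list[float]) -> list[float]:
    """Turn arbitrary scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(seq: list[list[float]]) -> list[list[float]]:
    """Scaled dot-product self-attention over a sequence of feature vectors.

    Assumption: queries, keys and values are the inputs themselves.
    """
    d = len(seq[0])
    out = []
    for q in seq:
        # Similarity of this position to every position, scaled by sqrt(d).
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in seq]
        weights = softmax(scores)
        # Each output is a weighted average of all input vectors.
        out.append([sum(w * v[c] for w, v in zip(weights, seq)) for c in range(d)])
    return out

# Toy traffic features: two time steps with two features each.
attended = self_attention([[1.0, 0.0], [0.0, 1.0]])
```

Each output vector is a weighted mix of all input positions, so positions that resemble the current one contribute more to its representation.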
